-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sqlstats: only include local region in statement_statistics #102192
Conversation
I've re-enabled The other two skipped tests, ./dev test --stress pkg/ccl/multiregionccl/ -f=TestMultiRegionDataDriven/regional_by_row --timeout=10m
./dev test --stress pkg/ccl/multiregionccl/ -f=TestMultiRegionDataDriven/regional_by_table --timeout=10m
|
Part of #89949. Addresses #98020. Addresses #99563. Related to cockroachdb/roachperf#129. Related to #102170. Previously, we attempted to record all the regions hit in a single statement execution in the sqlstats tables, leaning on the sqlAddressResolver to map traced nodeIDs to localities at execution time. While the sqlAddressResolver is generally non-blocking, the introduction of this code did cause some of the multiregion "this query shouldn't span regions" tests to start [flaking][] and it's more recently been [implicated][] in a 2.5% performance regression. Given that the probabilistic nature of the tracing meant that we generally weren't capturing all the relevant nodeIDs anyway, it seems like the most prudent thing to do here is take a step back and regroup. In the short term, let's stop even trying to gather all these regions. In the medium/long term, let's see if we can find a better approach. [flaking]: #98020 [implicated]: https://github.com/cockroachdb/roachperf/pull/129 Release note: None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @michae2)
bors r=j82w |
Build failed: |
bors r=j82w |
Build succeeded: |
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from 0e19563 to blathers/backport-release-23.1-102192: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 23.1.x failed. See errors above. error creating merge commit from 0e19563 to blathers/backport-release-23.1.0-102192: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 23.1.0 failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
@@ -189,7 +186,11 @@ func (ex *connExecutor) recordStatementSummary( | |||
} | |||
|
|||
nodes := util.CombineUnique(getNodesFromPlanner(planner), []int64{nodeID}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally: ExtractNodesFromSpan returns an unordered set already (intsets.Fast). Then ForEach converts it into a sequence. Then the caller uses util.CombineUnique to transform the sequence back into a set. This is just silly and these steps should be combined.
(cc @j82w )
Part of #89949.
Addresses #98020.
Addresses #99563.
Related to cockroachdb/roachperf#129.
Related to #102170.
Previously, we attempted to record all the regions hit in a single statement execution in the sqlstats tables, leaning on the sqlAddressResolver to map traced nodeIDs to localities at execution time.
While the sqlAddressResolver is generally non-blocking, the introduction of this code did cause some of the multiregion "this query shouldn't span regions" tests to start flaking and it's more recently been implicated in a 2.5% performance regression.
Given that the probabilistic nature of the tracing meant that we generally weren't capturing all the relevant nodeIDs anyway, it seems like the most prudent thing to do here is take a step back and regroup.
In the short term, let's stop even trying to gather all these regions. In the medium/long term, let's see if we can find a better approach.
Release note: None